The Marking System of the 
College Entrance Examination Board 

This study represents an investigation into the distribution of the 
marks of the College Entrance Examination Board for the years 1902 
to 1920 inclusive. It was made in order to discover if there were any 
grounds for the strong criticism of the college entrance examinations by 
New England educators, more especially secondary school principals 
and teachers. It is published at this time because the Board in its 
Twentieth Annual Report recognized the existence of sudden and violent 
fluctuations, from year to year, in the results of the examinations, in 
many subjects, and voted to employ expert assistance to aid in determin- 
ing the specific causes. 

SCOPE OF THE STUDY. 

The subjects selected were English Readings, Elementary French, 
Elementary Algebra and Plane Geometry for the reason that they were 
offered by nearly all candidates, thus involving a relatively large num- 
ber of cases. The arrangement of marks has been altered somewhat. A 
sample distribution as published by the board is as follows : 

Solid Geometry   90-100   75-89   60-74   50-59   40-49   0-39 

1916/1152*        1.8%     6.1%   18.2%   12.8%   14.1%   47% 

Most of the larger colleges and universities admit on a mark of 60 
or above while some of the smaller institutions will accept as low as 50. 
Assuming that the distribution ought to approximate the normal, for 
reasons which will be established later, and that anyone rated below 50 
has failed to pass, the data in each case have been corrected from the 
above to read as follows: 

Solid Geometry   90-100   75-89   60-74   50-59   0-49 

1916/1152         1.8%     6.1%   18.2%   12.8%   61.1% 
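
The correction described above amounts to adding the two lowest bands together. A minimal sketch of that step, using the Solid Geometry 1916 figures quoted in the text, follows; the function name `merge_failures` and the dictionary layout are illustrative assumptions, not anything published by the Board.

```python
# Published Solid Geometry 1916 distribution (percent of 1152 candidates).
published = {
    "90-100": 1.8, "75-89": 6.1, "60-74": 18.2,
    "50-59": 12.8, "40-49": 14.1, "0-39": 47.0,
}

def merge_failures(dist):
    """Combine the 40-49 and 0-39 bands into a single 0-49 failure band."""
    merged = {band: pct for band, pct in dist.items()
              if band not in ("40-49", "0-39")}
    merged["0-49"] = round(dist["40-49"] + dist["0-39"], 1)
    return merged

corrected = merge_failures(published)
```

The corrected table keeps the four passing bands unchanged and reports every mark below 50 as one group.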

The highest number of cases involved in any distribution was Ele- 
mentary Algebra 1920/5249 and the lowest Elementary French 1902/509 
with only 13 out of the 76 instances when the number fell below 1000. 

FACTS BROUGHT TO LIGHT. 

The following significant facts were discovered: 

(a) Out of 76 distributions graphed every one is bimodal with 
the exception of: 

English Readings 1902/800, 1906/1380, 1907/1661, 1908/1698, 
1912/1731. 



* In this and all similar cases the numerator of the fraction represents the year and 
the denominator the number of persons taking the examination. 
In every instance the second mode in the distribution occurs in 
the assignment of the lowest marks and very often contains a greater 
percentage of cases than the one in the middle. 

(b) Every distribution is skewed negatively or toward the lower 
end of the distribution of marks except : 

Elementary Algebra 1906/1180, 1913/1916, 1918/3826. 
Elementary French 1909/1196, 1916/2872. 
English Readings 1903/996. 

(c) The order in which the subjects approximate the normal dis- 
tribution is as follows: English Readings, Elementary French, Elemen- 
tary Algebra, Plane Geometry. In Figs. I and II are reproduced 
twenty selected graphs, five for each of the above subjects respectively. 

EFFECT OF YEARLY INCREASE. 

Various reasons suggested themselves as to why the results are so far 
from those expected. Bimodal distributions usually indicate a poor 
selection of cases. As the second mode in every instance is in the lower 
end or failure group, this might be caused by the influx of a large num- 
ber of unprepared persons in the hope of slipping by. This explanation 
is discarded, however, for (a) the data show that this does not occur 
at intervals but appears regularly in all subjects, (b) the yearly increase 
in the number of candidates, with the exception of 1916, has been rela- 
tively constant as is shown in Table I. 

RECOMMENDED CANDIDATES. 

If all candidates of doubtful preparation could be eliminated a 
different result might be obtained. Consequently graphs were made 
for the years 1912-1916 inclusive for "only those candidates who were 
recommended for examinations on the ground of full and satisfactory 
preparation."* 

It was found, however, that 

(a) In Elementary Algebra and Plane Geometry, every distribu- 
tion is bimodal, seven out of every ten are skewed negatively or toward 
the lowest grades, while the other three are skewed positively or toward 
the highest grades. 

(b) Of the five in Elementary French, four are bimodal and three 
are skewed positively. 

(c) In English Readings only one, 1916/2431, is bimodal, all the 
others tending roughly toward the normal. 



* Further study of the group could not be made, as only these limited data are 
published by the Board. 
It is very evident from this that there is slight improvement in the 
ratings of the recommended candidates in English Readings and Ele- 
mentary French but none in Elementary Algebra and Plane Geometry. 
The difference, however, is not marked enough to conclude that it is due 
to better preparation. 

TOTAL YEARLY MARKS. 

Theoretically, as the number of cases increases the nearer the dis- 
tribution should correspond to the normal. Graphs were prepared show- 
ing the distribution of the total number of marks given for all subjects 
from 1902 to 1920 inclusive for all candidates, and from 1912 to 1916 
for recommended candidates only. These show that in every case, (a) 
the distribution is bimodal, (b) it is skewed toward the lower end. Fig. 
III gives a selected list of graphical representations for totals of different 
years. 

If all of the marks assigned in all subjects from 1902 to 1920 in- 
clusive were combined into one grand total average distribution it would 
be as follows : 

Grand Total 90-100 75-89 60-74 50-59 0-49 

445,620 4.78% 18.34% 31.14% 13.78% 31.96% 

In other words out of 445,620 cases only 4.78% received the highest 
grade while 31.96% failed. How many of the latter tried over again 
and succeeded there are no data to show. 
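
The bimodal character of this grand total can be checked mechanically. The sketch below assumes a simple working definition of a mode, namely any band whose share exceeds that of each adjacent band; this definition and the helper name `modes` are illustrative assumptions, since the study itself judged bimodality from graphs.

```python
def modes(percentages):
    """Return indices of bands whose share exceeds that of each neighbour."""
    found = []
    last = len(percentages) - 1
    for i, p in enumerate(percentages):
        left = percentages[i - 1] if i > 0 else float("-inf")
        right = percentages[i + 1] if i < last else float("-inf")
        if p > left and p > right:
            found.append(i)
    return found

bands = ["90-100", "75-89", "60-74", "50-59", "0-49"]
grand_total = [4.78, 18.34, 31.14, 13.78, 31.96]  # percent, 1902-1920

mode_bands = [bands[i] for i in modes(grand_total)]

# Approximate number of marks below 50 implied by the stated percentage.
failures = round(445_620 * 31.96 / 100)
```

Under this definition the grand total has one mode in the 60-74 band and a second, slightly larger one in the failure band.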

A grand total average distribution for only those candidates recom- 
mended on the ground of full and satisfactory preparation as published 
for 1912 to 1916 inclusive is 

Grand Total 90-100 75-89 60-74 50-59 0-49 

87,642 6.35% 22.32% 32.28% 13.69% 25.36% 

This is slightly better than the one given above, but considering the 
fact that the individuals involved here were highly selected, a failure of 
one-fourth, or 21,910 cases out of 87,642, places upon the Board the re- 
sponsibility for a condition which is far reaching in its social and eco- 
nomic effects. 

SELECTED DISTRIBUTIONS. 

That the reader may have some samplings of extreme variations as a 
basis of comparison a selected list of graphs is given in Fig. IV. These 
are taken from different subjects and different years. The lowest num- 
ber of cases involved is 641 while the highest is 2063. 

WHAT WAS EXPECTED. 

As was said at the beginning of this article, it was expected that the 
results would approximate the normal distribution. Briefly the evidence 
supporting this is as follows: (a) Physical differences approximate 
the normal curve* as do mental characteristics,† (b) Marks, representing, 
as they do, estimates of mental abilities, are themselves distributed 
according to the same frequencies as the abilities they are designed to 
represent,‡ (c) The normal distribution of marks is the one usually 
found when a fairly large number of students are graded.§ 

Fig. III. Totals for different years. Number of marks assigned will be found in 
Table I. Divisions as in Fig. I. 

Concluding then that the assignment of any relatively large number 
of grades ought to approximate the normal distribution and steadily so 
as the number increases over 500, this further question remains : What 
is the best method of dividing this distribution into groups for translat- 
ing standing into a scale of marks? After a careful examination of all 
possible schemes we have concluded that the five division one is best. 
This is based on the orientation of a large number of cases around a 
central group whose accomplishment is considered median or average. 
Above and below lie groups of smaller size containing superior and in- 
ferior students in relation to the average and above and below these the 
still smaller groups of exceptions or failures. 

The method of dividing our theoretical distributions into the five 
divisions which we will represent by the letters A, B, C, D, E, would 
be as follows: Find the median of the distribution and lay off on the base, 
on either side, the distance of 1 P. E. Within the area embraced by this 
±1 P. E. there will fall 50% of the total number of cases. This would 
represent the center or average or C group. Now lay off beyond ±1 P. E., 
on either side, a distance equal to 2 P. E. Each one of the areas thus 
designated will contain 23% of the cases,‖ and would be represented by the 
letters B and D respectively. Again laying off the distance of 2 P. E. on 
either side we will reach the limits of the normal curve, as for all practical 
purposes the ordinate may be taken as zero when the abscissa is 5 P. E. 
The last two divisions just made would each contain 2% of the total 
number of cases and would be represented by the letters A and E. The 
relationship between the cases represented by the five divisions of our 
normal probability integral and our marking system would now be as 
follows:¶ 

 A     B     C     D     E 
 2%   23%   50%   23%    2% 
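
These percentages can be reproduced from the normal probability integral. The sketch below assumes the usual relation that one probable error equals about 0.6745 standard deviations, and places the cut points at 1 P. E. and 3 P. E. on either side of the median; the function names are illustrative only.

```python
import math

PE_IN_SIGMA = 0.6745  # one probable error, expressed in standard deviations

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def five_divisions(inner, outer):
    """Percent of cases in A, B, C, D, E for cut points at +/-inner and
    +/-outer (both in standard deviations); the tails run to infinity."""
    c = normal_cdf(inner) - normal_cdf(-inner)
    b = normal_cdf(outer) - normal_cdf(inner)
    a = 1.0 - normal_cdf(outer)
    return [100.0 * share for share in (a, b, c, b, a)]

# C spans +/-1 P. E.; B and D each run from 1 P. E. to 3 P. E.
pe_scheme = five_divisions(1 * PE_IN_SIGMA, 3 * PE_IN_SIGMA)
rounded = [round(p) for p in pe_scheme]
```

Treating the tails as unbounded rather than stopping at 5 P. E. changes the result only negligibly, which agrees with the remark that the ordinate is practically zero there.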

* Brooks: The Foundation of Zoology, pp. 156-157, and Yule: An Introduction to the 
Theory of Statistics, p. 84. 

† See the distribution of the IQ's of 905 unselected children 5-14 years of age in 
Terman: The Measurement of Intelligence, p. 66. 

‡ Dearborn: School and University Grades. University of Wisconsin Bulletin 
No. 368. 

§ Dearborn, Ibid., also Foster: The Administration of the College Curriculum, pp. 
250-300. 

‖ A table of the values of P. E. of the normal probability integral will be found in 
Rugg: Statistical Methods Applied to Education, p. 391. 

¶ This was the division used by Buckingham in the standardization of the 
Buckingham Spelling Scale. 
In like manner if we should lay off on either side of the mean the 
distance of A. D. we would find the following distribution : 

 A     B     C     D     E 

 2%   20%   56%   20%    2% 

or if we should take for our unit 0.5σ and then lay off 1σ on either side 
our relationship would be as follows:* 

 A     B     C     D     E 

 7%   24%   38%   24%    7% 
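
The same check can be made for this division. The sketch assumes the central group extends half a standard deviation on either side of the mean and that the adjoining groups each cover one further standard deviation; it uses `statistics.NormalDist` from the Python standard library, and the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

inner = 0.5          # C group: within 0.5 sigma of the mean
outer = inner + 1.0  # B and D groups: a further 1 sigma on either side

c_share = std_normal.cdf(inner) - std_normal.cdf(-inner)
b_share = std_normal.cdf(outer) - std_normal.cdf(inner)
a_share = 1.0 - std_normal.cdf(outer)

sigma_scheme = [round(100 * s) for s in
                (a_share, b_share, c_share, b_share, a_share)]
```

Because the central group is narrower here than under the P. E. method, more cases fall into the extreme divisions A and E.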

What is more commonly used by writers than either of the two pre- 
ceding is to lay off the distance Q on each side of the mean. We would 
then have:† 

 A     B     C     D     E 

 3%   22%   50%   22%    3% 

One of the first thoro treatments of variation in the marking of 
examinations was published by an English economist, Professor F. Y. 
Edgeworth, in the Journal of the Royal Statistical Society, September, 
1888. This paper showed that there is a probable error of 3% and a pos- 
sible error of 9%, in assigning a mark as representative of a student's 
real proficiency. Professor Edgeworth argued as a remedy that marks 
should be distributed according to the normal probability curve, but 
offered no suggestions as to its division. Many of the later writers, 
however, made definite divisions as given below : 
* This was the division used by Ayres in the construction of the Ayres Spelling Scale. 

† Tables of the values of A. D., σ and Q of the normal probability integral will be 
found in Thorndike: Mental and Social Measurements, pp. 219-220. 

‡ Professor Cattell recognized the P. E. distribution of cases. He altered the 
percentages to more nearly meet the needs of classroom teachers who deal with 
small numbers, usually not exceeding 40. 
A study of Figures I to IV inclusive will show no such relationship 
between the percentage of cases in the five divisions as is brought out 
here. Indeed one is amazed at the remarkable extent of divergence. 
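
The extent of the divergence can be put in figures. The sketch below sets the grand total distribution for 1902 to 1920 against the P. E. division; matching the Board's five mark bands one-to-one with the divisions A to E is an assumption made here for illustration, as is the summary measure `misplaced`.

```python
# Assumption: bands 90-100, 75-89, 60-74, 50-59, 0-49 correspond to A-E.
divisions = ["A", "B", "C", "D", "E"]
theoretical = [2, 23, 50, 23, 2]                 # P. E. division, percent
observed = [4.78, 18.34, 31.14, 13.78, 31.96]    # grand total, percent

# Signed excess (+) or deficit (-) of the observed share in each division.
divergence = [round(o - t, 2) for o, t in zip(observed, theoretical)]

# Half the sum of absolute differences: the percent of all marks that
# would have to move to another division for the two to agree.
misplaced = round(sum(abs(d) for d in divergence) / 2, 2)
```

On this reckoning the central group is short by nearly 19 points while the lowest group is in excess by nearly 30.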

EFFECT OF READING METHODS ON THE DISTRIBUTION. 

A number of examiners and readers have been consulted, from 
whom the following facts have been ascertained : 

(a) Any paper marked between 50 and 60 by a reader is re-read 
by one or more before a permanent rating is given. This is due to the 
fact that the passing mark for some of the larger universities is 60 while 
that of many smaller colleges is 50. The re-examination of the paper 
is to determine whether the writer shows sufficient actual knowledge of 
subject matter and indicates enough potential possibilities of develop- 
ment to profit by the work offered in that department of a large univer- 
sity. If in the opinion of the examiner he does not, then the mark is 
below 60 which will admit only to the smaller colleges. 

(b) Any paper marked over 90 by a reader is re-read by one or 
more readers before it is given its final mark. This is due to the fact 
that many prizes depend upon the highest awards. 

(c) Any paper originally marked between 60 and 90 is never re- 
read except in rare instances when the rating is only a few points above 
60. 

(d) At the beginning the examiners agree on a value to be assigned 
to each question. There are two different methods of determining this. 
In some cases it is arrived at as follows: (1) Accepting 100 as the highest 
possible score, when there are ten questions each is given a value of 10. 
If there are eight questions each is given a value of 12½. When there 
are two or more parts to any question each part is given a proportion of 
the value assigned to the question as a whole, i. e. if there were ten ques- 
tions the value of each would be 10. If one were divided into two parts, 
5 would be given to each part. (2) In other instances the rating assigned 
is arrived at by taking the composite evaluation of each question by 
the readers. A clear exposition of this method as applied to French 
will be found in an article by Professor Donald C. Stuart of Princeton 
in the Bulletin of the New England Modern Language Association, Sep- 
tember 1917. 
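
The first method of valuing questions, described under (d) above, reduces to simple division. The following sketch assumes that the parts of a divided question share its value equally, as in the example given; the function name `question_values` is hypothetical, and exact fractions are used so that 12½ is not rounded.

```python
from fractions import Fraction

def question_values(parts_per_question, total=100):
    """Split `total` equally among the questions, then split each
    question's value equally among its parts.

    parts_per_question: one entry per question giving its number of parts.
    Returns, for each question, a list of the values of its parts.
    """
    per_question = Fraction(total, len(parts_per_question))
    return [[per_question / parts] * parts for parts in parts_per_question]

eight_questions = question_values([1] * 8)        # each worth 12 1/2
ten_with_split = question_values([2] + [1] * 9)   # first question in 2 parts
```

The second method, a composite of the readers' own evaluations, cannot be sketched from the description given here.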

That this method of reading the papers is a contributing cause of 
the poor distribution of marks is evident for, (a) no conferences are held 
between the examiners and readers to agree on the interpretation and 
value to be assigned to questions, (b) no attempt is made to standardize 
values of questions by considering the percentage of answers correct 
or incorrect, (c) the principle is not recognized that the assignment of 
marks aggregating 1000 to 5000 in a subject, or 11,000 to 44,000 for a 
yearly total, ought to conform to the curve of error and hence no at- 
tempt is made to check up or correct results on the basis of the normal 
distribution. 

CONCLUSION. 

The facts seem to show clearly that, (a) only in rare instances, in 
the subjects studied, does the assignment of marks nearly approximate 
the normal, (b) the same condition holds true for the annual total for 
all subjects, (c) the results in cases where the pupils taking the examina- 
tion are recommended by their school authorities on the ground of full 
and satisfactory preparation are only slightly improved, (d) this cannot 
be due to an influx of unprepared candidates as the increase in numbers 
each year is relatively constant and the poor distribution is found an- 
nually from 1902 to date, (e) the method of reading and scoring the 
papers, especially the lack of standardization of values and corrections 
in conformity with the curve of error, is a very natural factor in causing 
the existing conditions, (f) the suggestion is made that some approximation 
to the normal curve offers the best basis for solving present irregularities. 
This need not affect the passing marks as they may still be determined 
by such principles as govern them at the present time, altho a 
reconsideration of these might well be made by the Board. 

Finally, in view of the large number of cases, no sufficient justifica- 
tion exists for the wide difference in the relative percentages assigned 
in the different subjects. Whether the distribution approximates the 
curve of error, or some other form, a certain uniformity in the different 
subjects may reasonably be expected. To accomplish this there must 
be co-operation between examiners and readers in the different subjects. 

The writer wishes to emphasize the fact that this article does not 
claim to present an exhaustive study of the marks given by the College 
Entrance Examination Board. There are many phases of the subject 
which have not been touched. Sufficient evidence has been produced, 
however, to show the existence of an unwarranted condition and it is 
hoped the movement already inaugurated by the Board will result 
in a definite, workable plan for improvement. 